Multimodal emotion recognition in speech-based interaction using facial expression, body gesture and acoustic analysis



Abstract

In this paper, a study on multimodal automatic emotion recognition during a speech-based interaction is presented. A database was constructed consisting of people pronouncing a sentence in a scenario where they interacted with an agent using speech. Ten people pronounced a sentence corresponding to a command while making eight different emotional expressions. Gender was equally represented, with speakers of several different native languages including French, German, Greek and Italian. Facial expression, gesture and acoustic analysis of speech were used to extract features relevant to emotion. For the automatic classification of unimodal, bimodal and multimodal data, a system based on a Bayesian classifier was used. After performing an automatic classification of each modality, the different modalities were combined using a multimodal approach. Fusion of the modalities at the feature level (before running the classifier) and at the results level (combining the results of the classifiers for each modality) was compared. Fusing the multimodal data resulted in a large increase in the recognition rates in comparison to the unimodal systems: the multimodal approach increased the recognition rate by more than 10% when compared to the most successful unimodal system. Bimodal emotion recognition based on all combinations of the modalities (i.e., 'face-gesture', 'face-speech' and 'gesture-speech') was also investigated. The results show that the best pairing is 'gesture-speech'. Using all three modalities resulted in a 3.3% classification improvement over the best bimodal results. © OpenInterface Association 2009.
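The abstract contrasts fusion at the feature level with fusion at the results (decision) level. The following Python sketch is an illustration only, not the authors' code: it uses a Gaussian naive Bayes classifier as a stand-in for the paper's Bayesian classifier, and the feature matrices X_face, X_gesture, X_speech, the label vector y and the posterior-averaging rule are assumptions introduced here rather than details taken from the paper.

```python
# Sketch of the two fusion strategies named in the abstract.
# Assumptions (not from the paper): per-modality feature matrices of shape
# (n_samples, n_features), a shared label vector y with the 8 emotion classes,
# GaussianNB as the Bayesian classifier, and posterior averaging as the
# decision-level combination rule.
import numpy as np
from sklearn.naive_bayes import GaussianNB


def feature_level_fusion(X_face, X_gesture, X_speech, y):
    """Concatenate the modality features, then train a single classifier."""
    X_fused = np.hstack([X_face, X_gesture, X_speech])
    return GaussianNB().fit(X_fused, y)


def decision_level_fusion(X_face, X_gesture, X_speech, y):
    """Train one classifier per modality and combine their outputs."""
    clfs = [GaussianNB().fit(X, y) for X in (X_face, X_gesture, X_speech)]

    def predict(Xf, Xg, Xs):
        # Average the per-modality class posteriors and pick the most
        # probable class (one simple combination rule among many).
        probs = [c.predict_proba(X) for c, X in zip(clfs, (Xf, Xg, Xs))]
        avg = np.mean(probs, axis=0)
        return clfs[0].classes_[avg.argmax(axis=1)]

    return predict
```

The essential difference is where the modalities meet: feature-level fusion merges them into one input before a single classifier is trained, while results-level fusion classifies each modality separately and only combines the classifier outputs.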
